Knowledge graph (KG) link prediction aims to infer new facts based on existing facts in the KG. Recent studies have shown that using the graph neighborhood of a node via graph neural networks (GNNs) provides more useful information compared to just using the query information. Conventional GNNs for KG link prediction follow the standard message-passing paradigm on the entire KG, which leads to over-smoothing of representations and also limits their scalability. On a large scale, it becomes computationally expensive to aggregate useful information from the entire KG for inference. To address the limitations of existing KG link prediction frameworks, we propose a novel retrieve-and-read framework, which first retrieves a relevant subgraph context for the query and then jointly reasons over the context and the query with a high-capacity reader. As part of our exemplar instantiation for the new framework, we propose a novel Transformer-based GNN as the reader, which incorporates graph-based attention structure and cross-attention between query and context for deep fusion. This design enables the model to focus on salient context information relevant to the query. Empirical results on two standard KG link prediction datasets demonstrate the competitive performance of the proposed method.
translated by 谷歌翻译
全向视频中的光流估计面临两个重要问题:缺乏基准数据集以及调整基于视频的方法以适应全向性质的挑战。本文提出了第一个具有360度视野Flow360的感知上天然合成的全向基准数据集,其中有40个不同的视频和4,000个视频帧。我们在数据集和现有的光流数据集之间进行了全面的特征分析和比较,这些数据集表现出感知现实主义,独特性和多样性。为了适应全向性质,我们提出了一个新颖的暹罗表示学习框架(SLOF)。我们以对比度的方式训练我们的网络,并结合了对比度损失和光流损失的混合损失函数。广泛的实验验证了所提出的框架的有效性,并在最新方法中显示出40%的性能提高。我们的Flow360数据集和代码可在https://siamlof.github.io/上找到。
translated by 谷歌翻译
旨在累积学习的认知架构必须提供必要的信息和控制结构,以允许代理商从他们的经验中逐步学习。这涉及管理代理人的目标,并在其感知 - 认知信息堆栈中连续将感官信息与这些联系起来。学习代理的环境越多,越一般,灵活的必须是这些机制来处理更广泛的相关模式,任务和目标结构。虽然许多研究人员同意不同水平的信息可能在其化妆和结构和处理机制中不同,但对这些差异的详情同意通常不在研究界中共享。已经提出了二进制处理架构(通常称为System-1和系统-2)作为低级别信息的认知处理模型。我们以这种方式不存在认知并不是二进制文件,并且任何抽象级别的知识都涉及我们所指的是神经组织信息的信息,这意味着高水平和低级的数据必须包含符号和亚象的信息。此外,我们认为高和低水平数据抽象的处理之间的主要区分因子可以很大程度上归因于所涉及的注意机制的性质。我们描述了此观点背后的关键论据,并审查文献中的相关证据。
translated by 谷歌翻译
由于我们是婴儿,我们直观地发展了与视觉,音频和文本等不同认知传感器的输入相关联的能力。然而,在机器学习中,这种跨模型学习是一种非活动任务,因为不同的方式没有均匀性质。以前的作品发现,应该有不同的方式存在桥梁。从神经病学和心理学的角度来看,人类有能力将一种模态与另一个方式联系起来,例如,将一只鸟的图片与歌唱的唯一听证者相关联,反之亦然。机器学习算法是否可能恢复给定音频信号的场景?在本文中,我们提出了一种新型级联关注的残留甘(Car-GaN),旨在重建给定相应的音频信号的场景。特别地,我们介绍残留物模块,以逐渐降低不同方式之间的间隙。此外,具有新型分类损失函数的级联注意网络旨在解决跨模型学习任务。我们的模型在高级语义标签域中保持一致性,并且能够平衡两种不同的模式。实验结果表明,我们的模型在具有挑战性的子URMP数据集上实现了最先进的跨模型视听生成。代码将在https://github.com/tuffr5/car-gan中获得。
translated by 谷歌翻译
Compliance in actuation has been exploited to generate highly dynamic maneuvers such as throwing that take advantage of the potential energy stored in joint springs. However, the energy storage and release could not be well-timed yet. On the contrary, for multi-link systems, the natural system dynamics might even work against the actual goal. With the introduction of variable stiffness actuators, this problem has been partially addressed. With a suitable optimal control strategy, the approximate decoupling of the motor from the link can be achieved to maximize the energy transfer into the distal link prior to launch. However, such continuous stiffness variation is complex and typically leads to oscillatory swing-up motions instead of clear launch sequences. To circumvent this issue, we investigate decoupling for speed maximization with a dedicated novel actuator concept denoted Bi-Stiffness Actuation. With this, it is possible to fully decouple the link from the joint mechanism by a switch-and-hold clutch and simultaneously keep the elastic energy stored. We show that with this novel paradigm, it is not only possible to reach the same optimal performance as with power-equivalent variable stiffness actuation, but even directly control the energy transfer timing. This is a major step forward compared to previous optimal control approaches, which rely on optimizing the full time-series control input.
translated by 谷歌翻译
Periocular refers to the region of the face that surrounds the eye socket. This is a feature-rich area that can be used by itself to determine the identity of an individual. It is especially useful when the iris or the face cannot be reliably acquired. This can be the case of unconstrained or uncooperative scenarios, where the face may appear partially occluded, or the subject-to-camera distance may be high. However, it has received revived attention during the pandemic due to masked faces, leaving the ocular region as the only visible facial area, even in controlled scenarios. This paper discusses the state-of-the-art of periocular biometrics, giving an overall framework of its most significant research aspects.
translated by 谷歌翻译
Cell-free multi-user multiple input multiple output networks are a promising alternative to classical cellular architectures, since they have the potential to provide uniform service quality and high resource utilisation over the entire coverage area of the network. To realise this potential, previous works have developed radio resource management mechanisms using various optimisation engines. In this work, we consider the problem of overall ergodic spectral efficiency maximisation in the context of uplink-downlink data power control in cell-free networks. To solve this problem in large networks, and to address convergence-time limitations, we apply scalable multi-objective Bayesian optimisation. Furthermore, we discuss how an intersection of multi-fidelity emulation and Bayesian optimisation can improve radio resource management in cell-free networks.
translated by 谷歌翻译
Early recognition of clinical deterioration (CD) has vital importance in patients' survival from exacerbation or death. Electronic health records (EHRs) data have been widely employed in Early Warning Scores (EWS) to measure CD risk in hospitalized patients. Recently, EHRs data have been utilized in Machine Learning (ML) models to predict mortality and CD. The ML models have shown superior performance in CD prediction compared to EWS. Since EHRs data are structured and tabular, conventional ML models are generally applied to them, and less effort is put into evaluating the artificial neural network's performance on EHRs data. Thus, in this article, an extremely boosted neural network (XBNet) is used to predict CD, and its performance is compared to eXtreme Gradient Boosting (XGBoost) and random forest (RF) models. For this purpose, 103,105 samples from thirteen Brazilian hospitals are used to generate the models. Moreover, the principal component analysis (PCA) is employed to verify whether it can improve the adopted models' performance. The performance of ML models and Modified Early Warning Score (MEWS), an EWS candidate, are evaluated in CD prediction regarding the accuracy, precision, recall, F1-score, and geometric mean (G-mean) metrics in a 10-fold cross-validation approach. According to the experiments, the XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
Reinforcement learning is a machine learning approach based on behavioral psychology. It is focused on learning agents that can acquire knowledge and learn to carry out new tasks by interacting with the environment. However, a problem occurs when reinforcement learning is used in critical contexts where the users of the system need to have more information and reliability for the actions executed by an agent. In this regard, explainable reinforcement learning seeks to provide to an agent in training with methods in order to explain its behavior in such a way that users with no experience in machine learning could understand the agent's behavior. One of these is the memory-based explainable reinforcement learning method that is used to compute probabilities of success for each state-action pair using an episodic memory. In this work, we propose to make use of the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks that need to be first addressed to solve a more complex task. The end goal is to verify if it is possible to provide to the agent the ability to explain its actions in the global task as well as in the sub-tasks. The results obtained showed that it is possible to use the memory-based method in hierarchical environments with high-level tasks and compute the probabilities of success to be used as a basis for explaining the agent's behavior.
translated by 谷歌翻译